OCPBUGS-63219: Add clientIPPreservationMode to AWS NLB parameters #2661
gcs278 wants to merge 1 commit into
|
@gcs278: This pull request references Jira Issue OCPBUGS-63219, which is invalid.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED |
|
I've marked as draft as I'm still working through the approach to fixing this bug. /test all |
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
b27f43f to 8eafa13
Actionable comments posted: 1
🧹 Nitpick comments (1)
operator/v1/types_ingress.go (1)
914-921: Encode the documented default in schema annotations.
Line 914 documents a default mode, but the field has no `+kubebuilder:default` marker. Adding it keeps CRD/OpenAPI behavior explicit and consistent for clients.
Suggested annotation:
```diff
 // When omitted, the default behavior is "Native".
 //
+// +kubebuilder:default:="Native"
 // +optional
 ClientIPPreservationMode ClientIPPreservationMode `json:"clientIPPreservationMode,omitempty"`
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@operator/v1/types_ingress.go`:
- Around line 905-907: The exported enum label for the NLB client IP mode is
inconsistent (code uses "Native" but the API should default to "Preserved");
update the enum constant/strings and documentation so the exported value is
"Preserved" everywhere, set the default to Preserved, and adjust all
validation/serialization code to accept and emit "Preserved" instead of "Native"
(update the enum declaration, Allowed/Valid list, defaulting in SetDefaults,
Validate, and any UnmarshalJSON/FromString helpers or switch cases that
reference "Native"); if you need to preserve backward-compatibility, add a
one-time alias mapping from the legacy "Native" input to "Preserved" in
Unmarshal/Validate while continuing to output "Preserved".
---
Nitpick comments:
In `@operator/v1/types_ingress.go`:
- Around line 914-921: The comment documents the default "Native" for the
ClientIPPreservationMode field but the CRD schema lacks a kubebuilder default;
add a +kubebuilder:default="Native" marker immediately above the
ClientIPPreservationMode field declaration (ClientIPPreservationMode
ClientIPPreservationMode `json:"clientIPPreservationMode,omitempty"`), ensuring
the OpenAPI/CRD schema and type ClientIPPreservationMode encode the documented
default.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 26a1373d-68a6-4e6e-830d-33b241a22822
⛔ Files ignored due to path filters (10)
- `openapi/generated_openapi/zz_generated.openapi.go` is excluded by `!openapi/**`, `!**/zz_generated*`
- `openapi/openapi.json` is excluded by `!openapi/**`
- `operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-CustomNoUpgrade.crd.yaml` is excluded by `!**/zz_generated.crd-manifests/*`
- `operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-Default.crd.yaml` is excluded by `!**/zz_generated.crd-manifests/*`
- `operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-DevPreviewNoUpgrade.crd.yaml` is excluded by `!**/zz_generated.crd-manifests/*`
- `operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-OKD.crd.yaml` is excluded by `!**/zz_generated.crd-manifests/*`
- `operator/v1/zz_generated.crd-manifests/0000_50_ingress_00_ingresscontrollers-TechPreviewNoUpgrade.crd.yaml` is excluded by `!**/zz_generated.crd-manifests/*`
- `operator/v1/zz_generated.featuregated-crd-manifests/ingresscontrollers.operator.openshift.io/AAA_ungated.yaml` is excluded by `!**/zz_generated.featuregated-crd-manifests/**`
- `operator/v1/zz_generated.featuregated-crd-manifests/ingresscontrollers.operator.openshift.io/IngressControllerDynamicConfigurationManager.yaml` is excluded by `!**/zz_generated.featuregated-crd-manifests/**`
- `operator/v1/zz_generated.swagger_doc_generated.go` is excluded by `!**/zz_generated*`
📒 Files selected for processing (1)
operator/v1/types_ingress.go
8eafa13 to e6b120a
|
/remove-lifecycle stale |
e6b120a to cff0427
|
/jira refresh |
|
@gcs278: This pull request references Jira Issue OCPBUGS-63219, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug. |
Add clientIPPreservationMode field to AWSNetworkLoadBalancerParameters to control how client IP addresses are preserved. The field accepts "Native" (uses AWS's native client IP preservation) and "ProxyProtocol" (uses PROXY protocol v2, the new default).
When set to Native, the NLB target group has preserve_client_ip.enabled set to true, which may cause hairpin connection failures for internal load balancers when connections are made from pods to router pods on the same node.
When set to ProxyProtocol, the NLB target group has preserve_client_ip.enabled set to false and proxy_protocol_v2.enabled set to true. This allows backends to receive the original client IP via PROXY protocol headers while avoiding hairpin connection failures.
https://redhat.atlassian.net/browse/OCPBUGS-63219
cff0427 to 1fb56fc
|
@gcs278: all tests passed! |
|
@JoelSpeed I was discussing with @Miciah and I'm going to write up a quick design doc for this API and the defaulting behavioral changes so we can discuss in more detail. Feel free to review/comment in the meantime though. |
|
/assign @yuqi-zhang |
|
/hold While we discuss design details. But feel free to still review, or review my doc. |
yuqi-zhang left a comment
Let me know when the design is finalized. Just some initial thoughts:
- The API definition itself looks fine. Although it is simple, it would be best to have a few unit tests attached.
- Since this introduces a new field to a stable API, the general expectation is that a featuregate would back it. However, it seems that you are doing this as a bugfix patch. Are you planning on backporting this to previous versions? We generally do not backport new API fields.
Based on your description:
AWS NLBs have preserve_client_ip.enabled=true by default, which causes hairpin connection failures on internal NLBs when a pod sends traffic through the NLB and it routes back to the same node. The fix is to disable native client IP preservation and use PROXY protocol v2 instead — the same mechanism CLBs already use.
I would expect clusters on previous versions that run into this to apply a manual workaround, if possible, rather than receive a backported API field and fix. This can serve as the fix for future clusters.
|
/assign @Miciah |
Summary
Note: Design Doc with an alternative is here: https://docs.google.com/document/d/14UVY8U-exch30-tYloX_-DMhM7hJX8226VA2q7bTJeo/edit?tab=t.0
Add the `clientIPPreservationMode` field to `AWSNetworkLoadBalancerParameters` to control how client IP addresses are preserved. The field accepts "Native" (uses AWS's native client IP preservation) and "ProxyProtocol" (uses PROXY protocol v2, the new default).
When set to Native, the NLB target group has `preserve_client_ip.enabled` set to `true`, which may cause hairpin connection failures for internal load balancers when connections are made from pods to router pods on the same node.
When set to ProxyProtocol, the NLB target group has `preserve_client_ip.enabled` set to `false` and `proxy_protocol_v2.enabled` set to `true`. This allows backends to receive the original client IP via PROXY protocol headers while avoiding hairpin connection failures.
Why a new API field?
AWS NLBs have `preserve_client_ip.enabled=true` by default, which causes hairpin connection failures on internal NLBs when a pod sends traffic through the NLB and it routes back to the same node. The fix is to disable native client IP preservation and use PROXY protocol v2 instead, the same mechanism CLBs already use.
However, simply changing the default NLB behavior is not safe for existing clusters:
An opt-in API field lets users explicitly choose ProxyProtocol when they need hairpin to work, while preserving the existing behavior for clusters that are working fine.
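The mapping from mode to NLB target group attributes described in this summary can be sketched roughly as follows. This is only an illustration: the constant names and the `targetGroupAttributes` helper are hypothetical and are not taken from this PR or from the ingress operator; only the two mode strings and the attribute keys and values come from the description.

```go
package main

import "fmt"

// ClientIPPreservationMode mirrors the string enum described in the PR.
type ClientIPPreservationMode string

// Constant names are hypothetical; only the string values come from the PR text.
const (
	NativeClientIPPreservation        ClientIPPreservationMode = "Native"
	ProxyProtocolClientIPPreservation ClientIPPreservationMode = "ProxyProtocol"
)

// targetGroupAttributes is a hypothetical helper that returns the NLB target
// group attributes the PR description associates with each mode.
func targetGroupAttributes(mode ClientIPPreservationMode) (map[string]string, error) {
	switch mode {
	case NativeClientIPPreservation:
		// AWS-native preservation; may cause hairpin failures on internal NLBs.
		return map[string]string{"preserve_client_ip.enabled": "true"}, nil
	case ProxyProtocolClientIPPreservation:
		// Client IP is carried in PROXY protocol v2 headers instead.
		return map[string]string{
			"preserve_client_ip.enabled": "false",
			"proxy_protocol_v2.enabled":  "true",
		}, nil
	default:
		return nil, fmt.Errorf("unsupported clientIPPreservationMode %q", mode)
	}
}

func main() {
	modes := []ClientIPPreservationMode{
		NativeClientIPPreservation,
		ProxyProtocolClientIPPreservation,
	}
	for _, mode := range modes {
		attrs, err := targetGroupAttributes(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(mode, attrs)
	}
}
```

Returning an error for unknown values keeps the sketch honest about the enum being closed; how the real operator applies these attributes is outside what this page shows.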
Why controller-managed defaults (not CRD defaults)?
Per the OpenShift API conventions for configuration APIs, IngressController fields should be defaulted in the controller rather than via `+kubebuilder:default`. This is because CRD defaults are applied on read, meaning a `+kubebuilder:default` would retroactively apply to existing IngressControllers on upgrade, changing their behavior without the user's knowledge.
Instead, the ingress operator defaults `clientIPPreservationMode` to `ProxyProtocol` only when creating new IngressControllers. Existing IngressControllers that were created before this field existed will have the field omitted, and the operator treats omitted as `Native`, preserving their current behavior with no change on upgrade.
The godoc uses the standard conventions wording to reserve the right to change the default in the future: "When omitted, this means the user has no opinion and the value is left to the platform to choose a good default, which is subject to change over time. The current default is ProxyProtocol."
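A minimal sketch of the controller-managed defaulting described above, under stated assumptions: `nlbParams`, `defaultForNewIngressController`, and `effectiveMode` are hypothetical names invented for illustration, not the actual types or functions in openshift/api or the ingress operator. It separates the two decisions in the text: what gets written on creation, and how an omitted value is interpreted later.

```go
package main

import "fmt"

type ClientIPPreservationMode string

// Constant names are hypothetical; the string values come from the PR text.
const (
	Native        ClientIPPreservationMode = "Native"
	ProxyProtocol ClientIPPreservationMode = "ProxyProtocol"
)

// nlbParams is a hypothetical stand-in for AWSNetworkLoadBalancerParameters.
type nlbParams struct {
	ClientIPPreservationMode ClientIPPreservationMode
}

// defaultForNewIngressController fills in ProxyProtocol only when the field
// was left empty on a newly created object (assumed creation-time hook).
func defaultForNewIngressController(p *nlbParams) {
	if p.ClientIPPreservationMode == "" {
		p.ClientIPPreservationMode = ProxyProtocol
	}
}

// effectiveMode interprets a stored object: an omitted field means the object
// predates the field, so the existing Native behavior is kept.
func effectiveMode(p nlbParams) ClientIPPreservationMode {
	if p.ClientIPPreservationMode == "" {
		return Native
	}
	return p.ClientIPPreservationMode
}

func main() {
	// A new IngressController gets the field set explicitly at creation.
	fresh := nlbParams{}
	defaultForNewIngressController(&fresh)
	fmt.Println("new:", effectiveMode(fresh))

	// An IngressController created before the field existed has it omitted.
	existing := nlbParams{}
	fmt.Println("existing:", effectiveMode(existing))
}
```

Because no default is encoded in the CRD schema in this sketch, an object stored without the field keeps reading back as omitted after an upgrade, which is what lets the operator treat it as Native.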
Implementation PR: openshift/cluster-ingress-operator#1426